Generative Adversarial Networks (GAN)


By Prof. Seungchul Lee
http://iai.postech.ac.kr/
Industrial AI Lab at POSTECH

Table of Contents

1. Discriminative Model vs. Generative Model

  • Discriminative model




  • Generative model



2. Density Function Estimation

  • Probability density function estimation problem
  • If $P_{\text{model}}(x)$ can be estimated to be close to $P_{\text{data}}(x)$, then new data can be generated by sampling from $P_{\text{model}}(x)$.

    • Note: the Kullback–Leibler (KL) divergence is a kind of distance measure between two distributions (although it is not symmetric)
  • Learn a deterministic transformation via a neural network
    • Start by sampling the code vector $z$ from a simple, fixed distribution such as a uniform distribution or a standard Gaussian $\mathcal{N}(0,I)$
    • Then this code vector is passed as input to a deterministic generator network $G$, which produces an output sample $x=G(z)$
    • This is the role a neural network plays in a generative model (a nonlinear mapping to a target probability density function)
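As a small numerical illustration of the KL divergence note above, here is a sketch with two hypothetical discrete distributions (the specific values are made up for illustration):

```python
import numpy as np

# Hypothetical discrete distributions over 4 outcomes (illustrative only)
p = np.array([0.4, 0.3, 0.2, 0.1])      # stands in for P_data
q = np.array([0.25, 0.25, 0.25, 0.25])  # stands in for P_model (uniform)

# KL(p || q) = sum_i p_i * log(p_i / q_i); zero if and only if p == q
kl_pq = np.sum(p * np.log(p / q))
kl_pp = np.sum(p * np.log(p / p))

print(kl_pq)  # positive: q differs from p
print(kl_pp)  # 0.0: a distribution has zero divergence from itself
```

Driving this divergence toward zero is one way to formalize "estimating $P_{\text{model}}(x)$ close to $P_{\text{data}}(x)$".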



- An example of a generator network which encodes a univariate distribution with two different modes

  • Generative model of high dimensional space
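The bullets above describe a deterministic map from a simple code distribution to a distribution with two modes. A minimal numpy sketch, with a hand-crafted map standing in for a trained generator $G$ (the coefficients are arbitrary, chosen only to separate two modes):

```python
import numpy as np

# Sample the code vector z from a standard Gaussian
rng = np.random.default_rng(0)
z = rng.standard_normal(10000)

# Hand-crafted deterministic "generator": the sign of z selects a mode,
# so a unimodal input distribution maps to a bimodal output distribution
x = np.where(z < 0, -2.0 + 0.3 * z, 2.0 + 0.3 * z)

# Roughly half of the samples land near each mode (-2 and +2)
print((x < 0).mean())  # approximately 0.5
```

A trained generator network does the same thing, except the mapping is learned rather than hand-crafted, and it typically operates in a high-dimensional space.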

3. Generative Adversarial Networks (GAN)

  • In generative modeling, we'd like to train a network that models a distribution, such as a distribution over images.

  • GANs do not work with any explicit density function!

  • Instead, they take a game-theoretic approach

  • One way to judge the quality of the model is to sample from it.

  • Train the model to produce samples that are indistinguishable from the real data, as judged by a discriminator network whose job is to tell real from fake






- The idea behind Generative Adversarial Networks (GANs): train two different networks
- Discriminator network: tries to distinguish between real and fake data

- Generator network: tries to produce realistic-looking samples to fool the discriminator network
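The two-player game above is usually written as the following minimax objective (the standard GAN formulation):

$$\min_G \max_D \; \mathbb{E}_{x \sim P_{\text{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim P_z}\left[\log\left(1 - D(G(z))\right)\right]$$

The discriminator $D$ maximizes this value (classifying real vs. fake correctly), while the generator $G$ minimizes it (fooling $D$).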

4. GAN with MNIST

4.1. GAN Implementation

In [1]:
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
In [2]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
In [3]:
mnist = tf.keras.datasets.mnist

(train_x, train_y), (test_x, test_y) = mnist.load_data()

train_x, test_x = train_x/255.0, test_x/255.0
train_x = train_x.reshape(-1, 784)
test_x = test_x.reshape(-1, 784)

print('train_x: ', train_x.shape)
print('test_x: ', test_x.shape)
train_x:  (60000, 784)
test_x:  (10000, 784)
In [4]:
generator = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units = 256, input_dim = 100, activation = 'relu'),
    tf.keras.layers.Dense(units = 784, activation = 'sigmoid')    
])
In [5]:
discriminator = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units = 256, input_dim = 784, activation = 'relu'),
    tf.keras.layers.Dense(units = 1, activation = 'sigmoid'),
])
In [6]:
discriminator.compile(optimizer = tf.keras.optimizers.Adam(learning_rate = 0.0001), 
                      loss = 'binary_crossentropy')
In [7]:
combined_input = tf.keras.layers.Input(shape = (100,))
generated = generator(combined_input)
discriminator.trainable = False
combined_output = discriminator(generated)

combined = tf.keras.models.Model(inputs = combined_input, outputs = combined_output)
In [8]:
combined.compile(optimizer = tf.keras.optimizers.Adam(learning_rate = 0.0002), 
                 loss = 'binary_crossentropy')
In [9]:
def make_noise(samples):
    return np.random.normal(0, 1, [samples, 100])
In [10]:
def plot_generated_images(generator, samples = 3):
    
    noise = make_noise(samples)
    
    generated_images = generator.predict(noise)
    generated_images = generated_images.reshape(samples, 28, 28)
    
    for i in range(samples):
        plt.subplot(1, samples, i+1)
        plt.imshow(generated_images[i], 'gray', interpolation = 'nearest')
        plt.axis('off')
        plt.tight_layout()
    plt.show()
In [11]:
n_iter = 50000
batch_size = 100

fake = np.zeros(batch_size)
real = np.ones(batch_size)

for i in range(n_iter):
        
    # Train Discriminator
    noise = make_noise(batch_size)
    generated_images = generator.predict(noise)

    idx = np.random.randint(0, train_x.shape[0], batch_size)
    real_images = train_x[idx]
    
    D_loss_real = discriminator.train_on_batch(real_images, real)
    D_loss_fake = discriminator.train_on_batch(generated_images, fake)
    D_loss = D_loss_real + D_loss_fake
    
    # Train Generator
    noise = make_noise(batch_size)    
    G_loss = combined.train_on_batch(noise, real)
    
    if i % 5000 == 0:
        
        print('Discriminator Loss: ', D_loss)
        print('Generator Loss: ', G_loss)

        plot_generated_images(generator)
Discriminator Loss:  1.5595218539237976
Generator Loss:  0.7863957285881042
Discriminator Loss:  0.25488316267728806
Generator Loss:  2.4399359226226807
Discriminator Loss:  0.3470231890678406
Generator Loss:  2.2872633934020996
Discriminator Loss:  0.5063828378915787
Generator Loss:  2.1873669624328613
Discriminator Loss:  0.5098372101783752
Generator Loss:  1.9037092924118042
Discriminator Loss:  0.5076103210449219
Generator Loss:  2.3944220542907715
Discriminator Loss:  1.7282211184501648
Generator Loss:  0.9864964485168457
Discriminator Loss:  0.6391394138336182
Generator Loss:  2.163146734237671
Discriminator Loss:  0.371719628572464
Generator Loss:  2.474647283554077
Discriminator Loss:  0.38926680386066437
Generator Loss:  2.30230712890625
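Note that in the generator update above, the fake samples are deliberately labeled as real (`real = np.ones(...)`): minimizing binary cross-entropy with target 1 pushes $D(G(z))$ toward 1, i.e., toward fooling the discriminator. A small numpy illustration with hypothetical discriminator outputs:

```python
import numpy as np

def bce(y, p):
    # Binary cross-entropy for a single prediction p with target label y
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# Generator loss with target y = 1 reduces to -log(D(G(z)));
# the loss shrinks as the discriminator is fooled (p -> 1)
for p in [0.1, 0.5, 0.9]:
    print(p, bce(1, p))
```

This is why the generator's loss in the log falls as its samples improve, while the (frozen-in-the-combined-model) discriminator is trained separately on correctly labeled real and fake batches.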

4.2. After Training

  • After training, use the generator network to generate new data


In [12]:
plot_generated_images(generator)
In [13]:
%%javascript
$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js')